Discovery of Causal Models that Contain Latent Variables Through Bayesian Scoring of Independence Constraints
نویسندگان
چکیده
Discovering causal structure from observational data in the presence of latent variables remains an active research area. Constraint-based causal discovery algorithms are relatively efficient at discovering such causal models from data using independence tests. Typically, however, they derive and output only one such model. In contrast, Bayesian methods can generate and probabilistically score multiple models, outputting the most probable one; however, they are often computationally infeasible to apply when modeling latent variables. We introduce a hybrid method that derives a Bayesian probability that the set of independence tests associated with a given causal model are jointly correct. Using this constraint-based scoring method, we are able to score multiple causal models, which possibly contain latent variables, and output the most probable one. The structure-discovery performance of the proposed method is compared to an existing constraint-based method (RFCI) using data generated from several previously published Bayesian networks. The structural Hamming distances of the output models improved when using the proposed method compared to RFCI, especially for small sample sizes.
منابع مشابه
Causal discovery from databases with discrete and continuous variables1
Causal discovery is widely used for analysis of experimental data focusing on the exploratory analysis and suggesting probable causal dependencies. There is a variety of causal discovery algorithms in the literature. Some of these algorithms rely on the assumption that there are no latent variables in the model; others do not provide a scoring metric to easily compare the reliability of two can...
متن کاملLearning Sparse Causal Models is not NP-hard
This paper shows that causal model discovery is not an NP-hard problem, in the sense that for sparse graphs bounded by node degree k the sound and complete causal model can be obtained in worst case order N independence tests, even when latent variables and selection bias may be present. We present a modification of the well-known FCI algorithm that implements the method for an independence ora...
متن کاملCausal Discovery from Databases with Discrete and Continuous Variables
Bayesian Constraint-based Causal Discovery (BCCD) is a state-of-the-art method for robust causal discovery in the presence of latent variables. It combines probabilistic estimation of Bayesian networks over subsets of variables with a causal logic to infer causal statements. Currently BCCD is limited to discrete or Gaussian variables. Most of the real-world data, however, contain a mixture of d...
متن کاملTesting Edges by Truncations
We consider the problem of testing whether two variables should be adjacent (either due to a direct effect between them, or due to a hidden common cause) given an observational distribution, and a set of causal assumptions encoded as a causal diagram. In other words, given a set of edges in the diagram known to be true, we are interested in testing whether another edge ought to be in the diagra...
متن کاملA Bayesian Approach to Causal Discovery
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two di er signi cantly in theory and practice. An important di erence between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD
دوره 2017 شماره
صفحات -
تاریخ انتشار 2017